-
Notifications
You must be signed in to change notification settings - Fork 2k
bzImage / initrd support (v2) #670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bzImage / initrd support (v2) #670
Conversation
ef1a12a
to
e485e17
Compare
Changes look good. Please add integration tests as these are new functionalities. Add an entry in the changelog too as they're customer facing. |
I am working on integration tests -- it took a little longer to get the env up and running than I expected, but I should have something this week. |
e485e17
to
35edf1a
Compare
Rebased, and added integration tests and changelog entries. The integration tests show as 'error' because the matching images are not in the spec.ccfc.min bucket so the tests can't be run. They pass for me when running with TEST_MICROVM_IMAGES_S3_BUCKET=tdeegan.fc.test - that bucket has extra images img/bzimage/ and img/initrd/ . What's the right way to merge those in to spec.ccfc.min? |
35edf1a
to
a094553
Compare
Noop push to trigger the tests again. |
@@ -110,6 +110,7 @@ s3://<bucket-url>/microvm-images/ | |||
<microvm_test_image_folder_n>/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're touching/fixing this file, please also update the S3
bucket name from spec.firecracker
to spec.ccfc.min
, also the tree
has changed from:
s3://<bucket-url>/microvm-images/
to
s3://<bucket-name>/img/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, updated.
a094553
to
71f4031
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good. I was expecting the boot time to be larger when using bzimage, but not that much larger 😄 .
Not matrial for this PR, but what compression format is the new image in? I think some formats might actually be faster than the uncompressed binary. |
Tries loading as ELF first, then falls back to bzImage. bzImage is slower to boot than ELF, because the kernel has to decompress itself. Added an init_size field to the bootparams block, which the decompressor uses to know how much space it has to work with. Made it big enough for a moderate sized kernel. We really ought to be extracting the bootparams block from the kernel instead. Signed-off-by: Tim Deegan <[email protected]>
Plumb through an optional initrd path from the REST API, and load it at a fixed address. Signed-off-by: Tim Deegan <[email protected]>
Add plumbing for test VMs that have initrd and no rootfs, and test the boot time of both bzImage and bzImage+initrd configurations. They need longer timeouts than the vmlinux test because they both involve decompression at boot time. Signed-off-by: Tim Deegan <[email protected]>
The kernel and initrd are both gzip'd. I didn't try other compression formats, but I agree that something like lz4 might go faster. |
7b273c5
1175b13
to
7b273c5
Compare
Thanks all for the reviews! I've rebased against current master. The only conflicts were in the changelog and a line of context in test_boottime.py; no other changes since last time. |
Tim, apologies for showing up here at this late stage, but in the last days I was thinking that we should stick to our minimalism tenet here, and only add additional supported load protocols when there’s a clear requirement form a customer / user. Since right now there does not seem to be demand for this, I don't this is something we should merge at this time. So, to be consistent with our mission, there are two options:
Again, sorry for waking up to this so late in the game, I didn't think about it until recently. |
Totally understood. No point in having code in there that nobody needs. I'll drop this for now and it can be revived if there's more demand. |
I would like to see this feature. |
@rgooch is there a reason the rootfs drive option doesn't work for you? |
@raduweiss: I have OS images which contain an initrd. I would like to be able to boot those with Firecracker instead of QEMU. |
The thing with additional boot process features like [1] https://github.com/firecracker-microvm/firecracker/blob/master/SPECIFICATION.md |
Yes, I agree it can be worked around. However, we have a lot of people building conventional VM images which use initrd. It would be a lot of engineering work for them to get off of initrd, especially considering that basically all the major Linux OS vendors depend on initrd. It's a major effort to get people to start with something else. I realise that you want to keep Firecracker small. That said, with initrd support, I could boot most Linux VMs with Firecracker rather than QEMU. So close... |
@raduweiss I agree that we'd like to keep Firecracker (the binary) small and fast and bzImage/initrd will increase boot time due to need of decompression. However, I also view Firecracker, the repository a bit like a toolbox. In general, what do you think of including bzImage/initrd support as an optional/non-default feature which can be enabled at compile time? Booting bzImage/initrd is just super common and very convenient as they are easy to create/extract. |
I like the toolbox approach. Another possibility would be a plugin approach. |
Converting a bzImage to an elf image for firecracker is pretty easy. This sctipt does the job. The These are extra steps so having support for bzIMage/initrd optional should be sufficient. |
Creating a file-system (ext4) image isn't cheap. I measured around 130 ms to |
I wasn't suggesting to create the root fs on the fly, but more to do a one-off conversion if you have a initrd. Do you have a specific use case in mind where it is necessary to convert it on the fly? |
We have a model where we don't store disc images, instead we store the directory structure and each regular file as a unique object. This is more efficient when there are many different images which have a hierarchy of derivations. When a VM is created, the root volume is created on-the-fly. The size of the volume is the size of the image plus a specified amount of free space. The SmallStack Hypervisor performs this on-the-fly root volume creation. Pre-computing a root volume would imply that the volume (and thus FS) would need to be resized when the VM is created. A bigger issue is that we'd have to pre-compute all these root volumes and that would be expensive to store and distribute, as there would be no de-duplication of file objects. |
Can we test how fast would it boot with lz4 images? I don't see any harm in having this support in Firecracker if it makes sense from a boot time perspective (as in it has similar boot times that we currently achieve with elf). |
I was going to run some tests but haven't got around to it yet. There is another option, which is to decompress the kernel as part of the load (rather than boot). If I understand the code right, this PR currently puts the bzImage into memory and then it get's decompressed as part of the boot. But we could also parse the image based on the format specified here, extract the vmlinux image from it, and load that. This way the boot time should not be affected, but the REST call for passing the image might be slower. A while back I implemented a parser/extractor for bzImages in Go which may provide a hint on how to do this in Rust. |
From my perspective, support for bzImage format would be convenient, but initrd support is more important to me. I can decompress and unpack the bzImage externally. I can even keep a small cache of extracted vmlinux files to amortise the unpacking cost. |
Hi, I am bumping this PR, since I am seeing this feature listed in the 2020 Roadmap. The initrd has the big advantage of allowing using a generic kernel, with few CONFIG_'s built in, with no need to support different rootfs filesystems. @tim-deegan-aws would you consider resuming this? If you need I'd be glad to help / spend some time on this to have it upstream. |
@marcov this PR also adds support for bzImage. For now we only agreed on adding initrd path support. I am not sure @tim-deegan-aws is still active on GitHub, so I wouldn't wait for too long on a response. |
@andreeaflorescu: Are you saying that what's stalling the addition of initrd support from this PR is the bzImage support? |
@rgooch Yap, a PR that only adds initrd support would be easily merged because that's what we also have on our roadmap. |
Opened a dedicated PR for initrd support here: #1246 |
@marcov thanks, we will take a look. |
This is a rebase of #428 - after various repo shuffles I'm not able to update that PR with the new version.
It adds support for loading bzImage kernels and initrds as an alternative to ELF64 kernels. It's slower than the vmlinux format because the kernel must decompress itself, but useful for trying kernels that I didn't build myself.
For now, it loads the initrd at a fixed address; it would be better to have a way of finding a free area of memory after the kernel has loaded.